
feat: agent CLI — JSON envelopes, cloud run/jobs, fragments, projects, CQL, structured errors #470

Merged: skishore23 merged 115 commits into main from agent-cli on Jun 22, 2026
Conversation

@skishore23

@skishore23 skishore23 commented Jun 13, 2026

Contributor

Summary

Turns comfy-cli into an agent-first CLI for ComfyUI — letting an agent (or a human) build, validate, run, and review image/video/audio workflows on a local server or Comfy Cloud, entirely from the terminal.

Three ideas hold it together:

  1. One machine contract — every command emits the same versioned JSON envelope, and every error carries a registered code plus an actionable hint (an error is a navigation signal toward the fix, not just a failure). Agents never parse prose.
  2. A compile model for workflows — author small typed fragments (or decompose an existing workflow into them) and compose them with YAML blueprints into one graph. The folders are source; the compiled workflow is a build artifact you never hand-edit.
  3. Async by default — submit returns immediately, a detached watcher tracks the job, and any later process resumes from a state file.
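
The first idea can be pictured with a minimal sketch. The field names `ok`, `data`, `error`, and `hint` come from the commit notes in this PR; the `schema` key, the `make_envelope` helper, the `run/1` schema name, and the example code and hint strings are assumptions made for illustration, not the CLI's actual API.

```python
import json


def make_envelope(schema, data=None, error_code=None, hint=None):
    """Build one envelope shape for both success and failure (hypothetical layout)."""
    ok = error_code is None
    return {
        "schema": schema,                      # assumed versioned schema id
        "ok": ok,
        "data": data if ok else None,
        "error": None if ok else {"code": error_code},
        "hint": hint,                          # actionable next step on failure
    }


success = make_envelope("run/1", data={"prompt_id": "abc123"})
failure = make_envelope(
    "run/1",
    error_code="server_unreachable",           # hypothetical error code
    hint="Start the server, then retry.",
)

print(json.dumps(success))
print(json.dumps(failure))
```

Because both outcomes share one shape, a caller can branch on `ok` and read `error.code` and `hint` without parsing prose.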

What's in it

  • Machine output (output/) — JSON/NDJSON envelopes, a schema registry, comfy discover; auto-JSON on non-TTY. Errors inherit their code's registered hint automatically.
  • Composition (fragments.py, command/workflow_fragments.py) — fragments + blueprints, foreach fan-out, and the $asset/$item/$var/$alias reference algebra. comfy workflow decompose is the inverse of compose: it projects any workflow into self-documenting fragment source (records where it came from and how to edit it).
  • Validation (cql/) — a pure-Python engine validates a graph against the live server's object_info before spending. Combo option types are preserved so the published schema matches what the backend accepts, and enum rejections carry the full, untruncated valid-options list.
  • Execution (command/jobs.py, jobs_state.py) — async run, detached watcher, resumable job state, item-named outputs.
  • Project convention (project.py) — the project/1 layout, assets push + content-addressed lock, run journal, and --where routing (OAuth-first credentials).
  • comfy preview — render a previewable image from any media: image → thumbnail, video → contact sheet, audio → waveform.
  • Skills (skills/) — bundled SKILL.md files that teach agents to drive the CLI (operate · build · debug · present).
  • Install robustness: comfy update / comfy install no longer assume the workspace interpreter ships pip; a pip-less uv-managed venv is bootstrapped via uv or ensurepip (no-op when pip is present).

Compatibility

No breaking changes to existing human-facing commands — pretty (TTY) output is unchanged. Everything new is additive: gated behind --json, new subcommands, or strictly-more-correct schema/error output.

How the contract stays honest

AST-scan ratchet tests fail the build if: a command emits without a registered schema; an error code is raised-but-unregistered, registered-but-never-raised, or missing a navigation hint; or credentials are read outside the single resolver.
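
A rough sketch of how such a ratchet could work, assuming errors are raised through a call that takes a `code="..."` keyword. The `error(...)` call shape, the registry contents, and the sample source are hypothetical; the real tests may scan differently.

```python
import ast

# Hypothetical registry: error code -> navigation hint.
REGISTERED = {
    "blueprint_not_found": "Check the blueprint path.",
    "compose_io_error": "Check file permissions.",
}

# Stand-in for a module under scan.
SOURCE = '''
def run():
    error(code="blueprint_not_found")
    error(code="typo_code")
'''


def raised_codes(source):
    """Collect string literals passed as code=... to any call in the source."""
    codes = set()
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            value = kw.value
            if kw.arg == "code" and isinstance(value, ast.Constant) and isinstance(value.value, str):
                codes.add(value.value)
    return codes


raised = raised_codes(SOURCE)
unregistered = raised - set(REGISTERED)      # raised but never registered
never_raised = set(REGISTERED) - raised      # registered but never raised
missing_hint = {code for code, hint in REGISTERED.items() if not hint.strip()}

print(unregistered, never_raised, missing_hint)
```

A test would then assert that all three sets are empty, so any drift between call sites and the registry fails the build.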

Notes for reviewers

Large branch — it lands the full agent-CLI surface, so it's best reviewed by area (the package list above maps to the sections of work). The core domain logic (fragments.py, cql/engine.py) is pure value-in/value-out and unit-tested in isolation; command/ shells are thin I/O wrappers.

Verification

  • Full test suite green: 2306 passed, 35 skipped.
  • CI runs build, test, ruff (lint + format), CodeQL, Socket, Codecov, and cross-platform (ubuntu / macOS / windows) + GPU e2e.

skishore23 and others added 30 commits May 22, 2026 13:47
Add a caller-aware rendering system that auto-detects agents vs humans:
- JSON envelope contract for every command (ok, data, error, hint)
- Rich-panel branding with gradient wordmark for interactive use
- Glyph system mapping cloud job statuses to user-facing icons
- JSON schemas for every envelope shape (env, which, run, jobs, etc.)
- Error-code registry with structured hints for agent recovery
- File-lock helper for concurrent job-state access
Add cloud authentication and dual local/cloud routing:
- OAuth 2.1 Authorization Code + PKCE flow with localhost callback
- Credential store for OAuth sessions and third-party API keys
- Unified ComfyClient that routes to local server or Comfy Cloud
- `comfy cloud login/logout/whoami/set-base-url` commands
- `comfy auth set/remove/list` for third-party tokens (HuggingFace, etc.)
- Auto-detection of auth method (OAuth vs API key) with clear precedence
- Constants aligned with cloud seed migration (client_id, scopes, paths)
Add a typed query language for exploring ComfyUI's 3400+ nodes:
- CQL (ComfyUI Query Language) with boolean predicates and pipelines
- WASM-based graph engine for validation and path-finding
- `comfy nodes search/show/ls/upstream/downstream/path/types/categories`
- Pre-submit workflow validation against the node catalog
- Unified object_info loader for local server and cloud
Add a complete workflow lifecycle:
- Async-by-default submit returning prompt_id + state file instantly
- Background job watcher that polls and writes state transitions to disk
- `comfy jobs ls/status/watch` for tracking and live-tailing
- `comfy workflow slots/set-slot/vary` for in-place editing and sweeps
- `comfy upload/download` for file transfer with auth handled internally
- Pre-flight validation of class_type and partner-API nodes before submit
Add ten composable SKILL.md files that teach AI agents to drive comfy:
- Core, cloud, debug, image, video, audio, edit, condition, pipeline, relay
- `comfy skill install/uninstall/list/status` for managing agent configs
- `comfy templates search/show/slots` for discovering workflow templates
- Skills auto-install into Claude Code, Cursor, and AGENTS.md
- Add `comfy setup` interactive wizard with --non-interactive mode for CI
- Wire all new subcommands into the CLI entry point
- Add wasmtime dependency for WASM-based graph engine
- Update .gitignore to use /projects/ for local creative work
- Fix subcommand module re-exports for eager discovery

Amp-Thread-ID: https://ampcode.com/threads/T-019e50f8-759d-745f-b20d-9330daba39d7
Co-authored-by: Amp <amp@ampcode.com>
- Block redirects on upload/download to prevent Bearer/API-Key replay
  to redirect targets (uses dedicated no-redirect opener)
- Sanitize multipart filenames per RFC 7578 to prevent Content-Disposition
  header injection via crafted filenames
- Validate download URLs are http/https before fetching to prevent SSRF
  via crafted state files or API responses
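
Two of these hardening steps can be sketched with the standard library alone. This is an illustration under assumptions, not the PR's code: `NoRedirectHandler`, `no_redirect_opener`, and `is_fetchable` are hypothetical names.

```python
import urllib.request
from urllib.parse import urlparse


class NoRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects so auth headers are never replayed elsewhere."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None means no follow-up request is built; urllib then
        # surfaces the 3xx response as an HTTPError to the caller.
        return None


# Dedicated opener for authenticated transfers; passing the subclass
# replaces the default redirect handler.
no_redirect_opener = urllib.request.build_opener(NoRedirectHandler)


def is_fetchable(url):
    """Accept only http/https URLs before any fetch is attempted."""
    return urlparse(url).scheme in ("http", "https")


print(is_fetchable("https://example.com/out.png"))
print(is_fetchable("file:///etc/passwd"))
```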

Amp-Thread-ID: https://ampcode.com/threads/T-019e50f8-759d-745f-b20d-9330daba39d7
Co-authored-by: Amp <amp@ampcode.com>
…vents

- Add PostHog provider alongside Mixpanel with atexit flush
- Honor DO_NOT_TRACK and COMFY_NO_TELEMETRY env vars
- Add execution_start/execution_success/execution_error lifecycle to `comfy run`
- Add generate:start/success/error/submitted lifecycle to `comfy generate`
- Add filter_command_kwargs() to redact sensitive args from analytics
- Add posthog>=6,<8 dependency
- Port tracking tests and lifecycle tests from main

Amp-Thread-ID: https://ampcode.com/threads/T-019e50f8-759d-745f-b20d-9330daba39d7
Co-authored-by: Amp <amp@ampcode.com>
…de safety

- _classify_api_workflow: scan all keys for class_type, not just first key.
  Workflows with metadata keys like _meta no longer rejected as invalid.

- validate_workflow: emit non_node_key warnings for entries without
  class_type instead of silently skipping them.

- Cloud prompt_rejected: parse node_errors into readable per-node hint
  lines (e.g. 'node 16 (ElevenLabsVoiceSelector): voice not in list')
  instead of dumping raw JSON. Guard against null errors arrays.

- Preflight: suppress pretty-print warnings in JSON mode to prevent
  Rich-formatted text from corrupting the NDJSON stdout stream.

- Tests: 6 new tests covering non-node keys, classify edge cases,
  and warning schema assertions. All 1771 tests pass.
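
The first two fixes amount to scanning every entry rather than trusting the first key. A minimal sketch, with hypothetical helper names (`looks_like_api_workflow`, `non_node_keys`) standing in for the real functions:

```python
def is_node(entry):
    """An API-format node is a dict that carries a class_type."""
    return isinstance(entry, dict) and "class_type" in entry


def looks_like_api_workflow(workflow):
    """True if any entry is a node, regardless of key order."""
    return isinstance(workflow, dict) and any(is_node(v) for v in workflow.values())


def non_node_keys(workflow):
    """Keys that should produce a warning instead of being skipped silently."""
    return [key for key, value in workflow.items() if not is_node(value)]


wf = {
    "_meta": {"source": "decompose"},                # metadata key listed first
    "3": {"class_type": "KSampler", "inputs": {}},
}

print(looks_like_api_workflow(wf))
print(non_node_keys(wf))
```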

Amp-Thread-ID: https://ampcode.com/threads/T-019e58cf-de52-74b5-9c0e-0d2860fe5626
Co-authored-by: Amp <amp@ampcode.com>
…ewal

Fragments/compose:
- Split workflow_fragments into a pure domain core (comfy_cli/fragments.py)
  and a thin Typer shell; standardize "recipe" -> "blueprint" everywhere
  (error codes recipe_* -> blueprint_*, payload key, skill docs).
- Support graph-typed sockets (MODEL/CONDITIONING/LATENT/VAE and any custom
  UPPER_SNAKE type) as fragment inputs so complex pipelines can be composed;
  path literals stay loader-injected for IMAGE/MASK/AUDIO/VIDEO only.
- Clearer errors for malformed cross-step refs and invalid step aliases;
  carry output type on _StepOutput so the save node needs no fragment reload.

Cloud poll resilience (comfy_client._request):
- Retry 429 (any method) and transient 5xx (GET only) with Retry-After-aware
  exponential backoff + jitter; bump wait_for_completion poll interval to 2s.
  Fixes `comfy run --wait` aborting on a transient rate-limit mid-job.
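
The retry policy described above can be sketched as two small pure functions. The specific 5xx codes, the base and cap values, and the choice to honor a numeric Retry-After verbatim are assumptions; a Retry-After given as an HTTP date is not handled here.

```python
import random

RETRYABLE_ANY_METHOD = {429}
RETRYABLE_GET_ONLY = {500, 502, 503, 504}   # assumed set of "transient" 5xx codes


def should_retry(method, status):
    """429 retries for any method; transient 5xx retries only for GET."""
    if status in RETRYABLE_ANY_METHOD:
        return True
    return method.upper() == "GET" and status in RETRYABLE_GET_ONLY


def backoff_delay(attempt, retry_after=None, base=1.0, cap=30.0, rng=random.random):
    """Seconds to sleep before retry number `attempt` (0-based)."""
    if retry_after is not None:
        # Server-provided delay in seconds wins, clamped to [0, cap].
        return min(cap, max(0.0, float(retry_after)))
    exp = min(cap, base * (2 ** attempt))
    # Jitter: pick a point in the upper half of the exponential window.
    return exp / 2 + rng() * exp / 2


print(should_retry("POST", 429), should_retry("POST", 503), should_retry("GET", 503))
print(backoff_delay(0, retry_after="5"))
```

Limiting 5xx retries to GET avoids resubmitting a non-idempotent request that the server may already have processed.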

OAuth session longevity:
- ensure_fresh_session() proactively refreshes an expired-but-refreshable
  session; wired into cloud_preflight (every cloud command auto-renews) and
  whoami. whoami `signed_in` now reflects token validity, not just presence.

Tests: full suite green (1789 passed, 35 skipped); ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Resolves 8 of the 9 open Dependabot alerts via lockfile upgrades:
- gitpython 3.1.45 -> 3.1.50 (5 HIGH: RCE via core.hooksPath newline
  injection, command injection via git options, path traversal)
- uv 0.11.6 -> 0.11.17 (MEDIUM: arbitrary file write via entry point names)
- idna 3.10 -> 3.17 (MEDIUM: idna.encode bypass)
- pygments 2.19.2 -> 2.20.0 (LOW: ReDoS in GUID regex)

Pin floors in pyproject: gitpython>=3.1.50, uv>=0.11.15. The re-resolve also
syncs previously-missing declared transitives (posthog 7.x + backoff/distro,
term-image/pillow) that had drifted out of the lockfile.

The 9th alert (comfy-cli < 3.39.2 / ComfyUI-Manager CRLF) is a self/advisory-
mapping artifact against the repo's own 0.0.0 package, not a bumpable dep.

Full suite green against upgraded versions (1789 passed, 35 skipped).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
CI installed ruff unpinned (`pip install ruff`) while pre-commit pinned
v0.12.4, so the two drifted and the repo was no longer clean under either.
Pin both to 0.15.15 (current latest) and run a one-time repo-wide
`ruff check --fix` + `ruff format` so CI's lint/format checks pass
deterministically. Also drops a dead local var and hoists a mid-file
import (E402) flagged by the linter.

No behavioral changes — formatting, import ordering, and version pins only.

Co-authored-by: Cursor <cursoragent@cursor.com>
…ixes

- skills: consolidate the bundled library from 10 skills to 4 via
  composition, prune retired skills on every install, and pluralize the
  canonical command to `comfy skills install` (singular kept as alias).
- update: add a non-intrusive upgrade hint on the intro banner (daily
  cache, no background threads) and a `comfy update cli` self-update path.
- feedback: add `comfy feedback` (inline one-shot for agents, interactive
  for TTY) and a hidden, fully consent-gated `comfy agent-review`; wire
  both into PostHog telemetry and register the new error code.
- setup: add an explicit, branded telemetry-consent step to the wizard and
  suppress the global lazy prompt while in `setup`.
- types: fix crash-class issues (possibly-unbound vars, Optional member
  access/subscript/iterable) and tighten signatures (`| None`, casts,
  fail-fast asserts) flagged by pyright.

Adds command-level and unit tests plus a live telemetry verification script.

Co-authored-by: Cursor <cursoragent@cursor.com>
…ck cache

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…again

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…lf-dedupe

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…stead of polling for 6h

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…lled jobs

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…cal and the help text

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…mit)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…instance isolation

Two top-level instances of the same subgraph definition aliased each other:
writing ``10/9.prompt`` mutated the shared definition, so instance 12's
interior node changed too. Fix adds ``_count_instances`` + ``_isolate_shared_subgraph``
helpers that deep-copy the definition under a fresh UUID and repoint the
instance's ``type`` before any interior mutation; a second write to the same
instance sees count==1 and skips the fork.
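
The fork-on-write idea can be shown on a toy structure. The workflow layout below (`nodes` with a `type` pointing into `definitions`) is an assumed simplification, and the helpers only mirror the names in the commit message; they are not the PR's implementation.

```python
import copy
import uuid


def count_instances(workflow, def_id):
    """How many top-level nodes point at this subgraph definition."""
    return sum(1 for node in workflow["nodes"] if node["type"] == def_id)


def isolate_shared_subgraph(workflow, instance_id):
    """Return a definition that is private to the instance, forking if shared."""
    node = next(n for n in workflow["nodes"] if n["id"] == instance_id)
    def_id = node["type"]
    if count_instances(workflow, def_id) > 1:
        new_id = str(uuid.uuid4())
        workflow["definitions"][new_id] = copy.deepcopy(workflow["definitions"][def_id])
        node["type"] = new_id              # repoint before any interior mutation
    return workflow["definitions"][node["type"]]


workflow = {
    "nodes": [{"id": 10, "type": "shared"}, {"id": 12, "type": "shared"}],
    "definitions": {"shared": {"nodes": {"9": {"prompt": "a cat"}}}},
}

interior = isolate_shared_subgraph(workflow, 10)
interior["nodes"]["9"]["prompt"] = "a dog"       # write to instance 10 only
again = isolate_shared_subgraph(workflow, 10)    # count is now 1, so no second fork
```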

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…lue is legitimately null

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… correctly

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…nly inputs

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…orts out=null

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…rning in the envelope

When resilient_load_object_info falls back to a cached copy after a
failed live fetch, it now fires an on_stale(host_key, error_str) callback.
Both comfy nodes and comfy workflow._load_object_info_or_fail wire this
callback to inject stale=true and warnings=[{code, message}] into the
emitted payload so agents can branch on cache freshness instead of
silently consuming potentially months-old node schemas.
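
A minimal sketch of the callback wiring, assuming the loader takes injectable fetch and cache functions. `load_with_fallback` and its parameters are hypothetical; only the `on_stale(host_key, error_str)` shape, the `stale` and `warnings` keys, and the `object_info_stale` code come from the description above.

```python
def load_with_fallback(fetch_live, read_cache, host_key, on_stale=None):
    """Try the live fetch; fall back to cache and report staleness."""
    try:
        return fetch_live()
    except OSError as exc:
        cached = read_cache()
        if cached is None:
            raise                          # nothing to fall back to
        if on_stale is not None:
            on_stale(host_key, str(exc))
        return cached


payload = {}


def mark_stale(host_key, error_str):
    """Inject freshness markers into the payload that will be emitted."""
    payload["stale"] = True
    payload["warnings"] = [
        {"code": "object_info_stale", "message": f"{host_key}: {error_str}"}
    ]


def failing_fetch():
    raise ConnectionError("connection refused")   # subclass of OSError


payload["nodes"] = load_with_fallback(
    failing_fetch, lambda: {"KSampler": {}}, "127.0.0.1:8188", mark_stale
)
print(payload)
```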

Also registers object_info_stale in the error-code registry (required by
the dict-literal scanner in test_error_code_registry.py even though this
is a warning code, not an error envelope).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…dentity for opted-out users

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@dosubot dosubot Bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Jun 13, 2026
@socket-security

socket-security Bot commented Jun 13, 2026


Review the following changes in direct dependencies. Learn more about Socket for GitHub.

| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
| Added | jsonschema@4.26.0 | 99 | 100 | 100 | 100 | 100 |
| Added | term-image@0.7.2 | 99 | 100 | 100 | 100 | 100 |
| Updated | uv@0.11.6 ⏵ 0.11.20 | 100 | 100 +2 | 100 | 100 | 100 |

View full report

The pytest 'build' job installed only pytest/pytest-cov, but several test
modules (output envelope/schema validation, project, assets-push, auth)
import jsonschema, failing collection with ModuleNotFoundError. Declare
jsonschema in the dev optional-dependencies and install it in the workflow.

Amp-Thread-ID: https://ampcode.com/threads/T-019ebe75-cced-75d2-a7ac-4cce7e22d0f4
Co-authored-by: Amp <amp@ampcode.com>

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
comfy_cli/tracking.py (1)

194-204: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Critical: disable() command enables tracking instead of disabling it, and enable() duplicates its initialization call.

Line 203 invokes init_tracking(True) but should invoke init_tracking(False). This breaks the disable command—users cannot opt out of tracking via the CLI. Line 198 in enable() redundantly calls init_tracking(True) again.

Additionally, the f-string literals in lines 197 and 204 use unnecessary f-string syntax without variable interpolation.

🔧 Proposed fix for both command functions
 @app.command()
 def enable():
     init_tracking(True)
-    typer.echo(f"Tracking is now {'enabled'}.")
-    init_tracking(True)
+    typer.echo("Tracking is now enabled.")

 @app.command()
 def disable():
-    init_tracking(False)
-    typer.echo(f"Tracking is now {'disabled'}.")
+    init_tracking(False)
+    typer.echo("Tracking is now disabled.")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@comfy_cli/tracking.py` around lines 194 - 204, The disable() command
incorrectly calls init_tracking(True) and enable() calls init_tracking(True)
twice and both echo strings use unnecessary f-strings; update enable() to call
init_tracking(True) only once (remove the duplicate init_tracking(True)), update
disable() to call init_tracking(False) instead of True, and simplify the echo
messages in both enable() and disable() (use plain string literals rather than
f-strings) while keeping the function names enable and disable and the
init_tracking and typer.echo calls intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@comfy_cli/tracking.py`:
- Around line 194-204: The disable() command incorrectly calls
init_tracking(True) and enable() calls init_tracking(True) twice and both echo
strings use unnecessary f-strings; update enable() to call init_tracking(True)
only once (remove the duplicate init_tracking(True)), update disable() to call
init_tracking(False) instead of True, and simplify the echo messages in both
enable() and disable() (use plain string literals rather than f-strings) while
keeping the function names enable and disable and the init_tracking and
typer.echo calls intact.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: d096bb3d-4186-4bf9-aff6-61de38a575e8

📥 Commits

Reviewing files that changed from the base of the PR and between 7db86c0 and 2530aed.

📒 Files selected for processing (3)
  • comfy_cli/command/custom_nodes/command.py
  • comfy_cli/tracking.py
  • tests/comfy_cli/test_tracking.py

The pytest job installs deps unpinned, so CI resolves click 8.4 / typer
0.26 / rich 15 while the uv.lock dev env stays on older pins. Three
breakages surfaced only under the newer libs:

- help_json: TyperGroup no longer subclasses click.Group (its MRO is now
  TyperGroup -> click.Command), so isinstance(cmd, click.Group) was false
  and the whole command tree came back empty. Duck-type list_commands/
  get_command instead. Fixes help_json + discovery tests.
- test_pr: click >= 8.2 keeps stderr out of CliRunner result.stdout; the
  --pr conflict error is on stderr. Assert against result.output (works on
  both old and new click).
- test_manager_gui: two find_cm_cli tests didn't pin workspace_path=None,
  so they took the workspace-Python subprocess branch and skipped find_spec
  when a workspace was configured. Pin it like the sibling test.
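
The duck-typing fix in the first item can be sketched without importing click at all. `FakeGroup` and `FakeLeaf` below are stand-ins for real command objects, and `command_tree` is a hypothetical helper; the only real API assumed is the `list_commands(ctx)` / `get_command(ctx, name)` pair that click groups expose.

```python
def is_group(cmd):
    """Treat anything with list_commands/get_command as a group, whatever its MRO."""
    return callable(getattr(cmd, "list_commands", None)) and callable(
        getattr(cmd, "get_command", None)
    )


def command_tree(cmd, ctx=None):
    """Recursively map subcommand names to their own subtrees."""
    if not is_group(cmd):
        return {}
    return {
        name: command_tree(cmd.get_command(ctx, name), ctx)
        for name in cmd.list_commands(ctx)
    }


class FakeLeaf:
    """Stand-in for a plain command."""


class FakeGroup:
    """Stand-in for a group that does not subclass click.Group."""

    def __init__(self, children):
        self._children = children

    def list_commands(self, ctx):
        return sorted(self._children)

    def get_command(self, ctx, name):
        return self._children.get(name)


root = FakeGroup({
    "run": FakeLeaf(),
    "jobs": FakeGroup({"ls": FakeLeaf(), "watch": FakeLeaf()}),
})
tree = command_tree(root)
print(tree)
```

An `isinstance` check would have returned an empty tree for `FakeGroup`, which mirrors the failure described above.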

Amp-Thread-ID: https://ampcode.com/threads/T-019ebe75-cced-75d2-a7ac-4cce7e22d0f4
Co-authored-by: Amp <amp@ampcode.com>

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
tests/comfy_cli/command/test_manager_gui.py (2)

82-86: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Loose assertion on captured output may mask behavior mismatches.

Line 86 asserts that the captured output contains either "unknown-mode" (lowercased) or the literal string "Unknown manager mode". This disjunctive assertion is permissive and could hide cases where the actual output differs from both expectations—for example, if the implementation emits a warning with different wording or casing.

Consider checking the actual implementation of _get_manager_flags() and tightening the assertion to match exactly what the code produces.

💡 Tighten the assertion (draft)

Once you've confirmed the exact output string from the implementation:

-        assert "unknown-mode" in captured.out.lower() or "Unknown manager mode" in captured.out
+        assert "Unknown manager mode: unknown-mode" in captured.out

(Adjust the expected string based on the actual implementation output.)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/comfy_cli/command/test_manager_gui.py` around lines 82 - 86, The test
test_unknown_mode_returns_default_with_warning uses a loose disjunctive
assertion on captured output which can mask mismatches; open the implementation
of _get_manager_flags(), determine the exact warning string it emits for an
unknown mode (exact text and casing), then replace the current or-check in the
assertion with a precise check that matches that exact string (e.g., assert
expected_msg in captured.out or assert captured.out.strip() == expected_msg)
inside test_unknown_mode_returns_default_with_warning so the test fails if the
wording or casing changes.

1091-1115: 🧹 Nitpick | 🔵 Trivial | 💤 Low value

Test structure is sound, but consider cache-clearing robustness.

The test test_find_cm_cli_cache_behavior at lines 1091–1115 correctly verifies that find_cm_cli() caches results after the first call. However, calling cache_clear() at line 1102 inside the test method (after already patching and importing the function) assumes that cache_clear() is always available on the imported function. If the function implementation or import order changes, this could become fragile.

A minor refinement: ensure cache_clear() is explicitly verified to exist (or gracefully handle its absence) before relying on it in all tests that use it.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/comfy_cli/command/test_manager_gui.py` around lines 1091 - 1115, The
test assumes find_cm_cli has a cache_clear attribute; make the test robust by
checking for and calling it only if present: after importing find_cm_cli in
test_find_cm_cli_cache_behavior, verify hasattr(find_cm_cli, "cache_clear") (or
use getattr with default None) and call cache_clear() only when available,
otherwise proceed without error; this ensures the test won't fail if find_cm_cli
loses or changes its lru_cache decoration while preserving the rest of the
assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@tests/comfy_cli/command/test_manager_gui.py`:
- Around line 82-86: The test test_unknown_mode_returns_default_with_warning
uses a loose disjunctive assertion on captured output which can mask mismatches;
open the implementation of _get_manager_flags(), determine the exact warning
string it emits for an unknown mode (exact text and casing), then replace the
current or-check in the assertion with a precise check that matches that exact
string (e.g., assert expected_msg in captured.out or assert captured.out.strip()
== expected_msg) inside test_unknown_mode_returns_default_with_warning so the
test fails if the wording or casing changes.
- Around line 1091-1115: The test assumes find_cm_cli has a cache_clear
attribute; make the test robust by checking for and calling it only if present:
after importing find_cm_cli in test_find_cm_cli_cache_behavior, verify
hasattr(find_cm_cli, "cache_clear") (or use getattr with default None) and call
cache_clear() only when available, otherwise proceed without error; this ensures
the test won't fail if find_cm_cli loses or changes its lru_cache decoration
while preserving the rest of the assertions.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6a12dbbd-5940-48fc-94a6-f34181cdd37b

📥 Commits

Reviewing files that changed from the base of the PR and between 7b4fec3 and 509554e.

📒 Files selected for processing (3)
  • comfy_cli/help_json.py
  • tests/comfy_cli/command/github/test_pr.py
  • tests/comfy_cli/command/test_manager_gui.py

- help_json: TyperOption/TyperArgument no longer subclass click.Option/
  Argument (MRO is -> click.Parameter), so isinstance(param, click.Option)
  dropped flags/envvar from every option in the --help-json contract.
  Duck-type via param.param_type_name == 'option' instead.
- test_jobs: the two 'flag visible in help' tests scraped rich-rendered
  --help text, which wraps/styles differently under the CI terminal and
  intermittently hid the flag. Assert against the render-independent
  build_help_json contract (the surface agents actually consume).
- test_project_command: pin 'project init --where cloud' so the marker
  default doesn't fall through to credential auto-detect (cloud on a dev
  box with creds, local on a clean CI runner).

Amp-Thread-ID: https://ampcode.com/threads/T-019ebe75-cced-75d2-a7ac-4cce7e22d0f4
Co-authored-by: Amp <amp@ampcode.com>
@codecov

codecov Bot commented Jun 13, 2026


Codecov Report

❌ Patch coverage is 70.47091% with 2132 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
comfy_cli/command/jobs.py 51.08% 338 Missing ⚠️
comfy_cli/command/setup.py 15.69% 247 Missing ⚠️
comfy_cli/command/nodes.py 47.49% 178 Missing ⚠️
comfy_cli/cmdline.py 33.99% 167 Missing ⚠️
comfy_cli/cql/engine.py 81.44% 152 Missing ⚠️
comfy_cli/cloud/command.py 18.49% 119 Missing ⚠️
comfy_cli/command/workflow.py 69.29% 105 Missing ⚠️
comfy_cli/command/models/search.py 68.00% 80 Missing ⚠️
comfy_cli/command/run/__init__.py 78.77% 76 Missing ⚠️
comfy_cli/command/project.py 61.62% 66 Missing ⚠️
... and 30 more
@@            Coverage Diff             @@
##             main     #470      +/-   ##
==========================================
- Coverage   83.36%   76.55%   -6.82%     
==========================================
  Files          45       95      +50     
  Lines        6848    14793    +7945     
==========================================
+ Hits         5709    11325    +5616     
- Misses       1139     3468    +2329     
Files with missing lines Coverage Δ
comfy_cli/auth/__init__.py 100.00% <100.00%> (ø)
comfy_cli/caller.py 100.00% <100.00%> (ø)
comfy_cli/command/__init__.py 100.00% <100.00%> (ø)
comfy_cli/command/custom_nodes/cm_cli_util.py 95.49% <100.00%> (+0.08%) ⬆️
comfy_cli/command/generate/client.py 77.77% <100.00%> (-0.49%) ⬇️
comfy_cli/command/generate/output.py 92.10% <100.00%> (+0.32%) ⬆️
comfy_cli/command/generate/schema.py 81.33% <100.00%> (+0.12%) ⬆️
comfy_cli/command/generate/spec.py 91.08% <ø> (ø)
comfy_cli/command/install.py 81.67% <100.00%> (ø)
comfy_cli/command/models/models.py 72.91% <100.00%> (ø)
... and 60 more

skishore23 and others added 8 commits June 12, 2026 19:35
Concurrency/robustness:
- cancellation: double-checked locking for singleton token; restore prior
  SIGINT handler on test reset
- oauth: set callback event even if response write fails; bind callback
  server on port 0 to remove pick-then-bind race
- jobs: bracket IPv6 hosts in URLs; sort terminal jobs newest-first; only
  POST /interrupt when the prompt is actually running; bound 'jobs watch'
  for an unknown prompt instead of hanging
- jobs_state: malformed state files return None instead of raising TypeError
- locking: Windows _acquire seeks to offset 0 before locking
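
As one small illustration from this list, bracketing IPv6 hosts when building a URL might look like the following. `host_for_url` and `base_url` are hypothetical helpers, not names from the PR.

```python
import ipaddress


def host_for_url(host):
    """Wrap IPv6 literals in brackets; leave IPv4 addresses and hostnames alone."""
    try:
        if ipaddress.ip_address(host).version == 6:
            return f"[{host}]"
    except ValueError:
        pass                               # not an IP literal, e.g. "localhost"
    return host


def base_url(host, port, scheme="http"):
    return f"{scheme}://{host_for_url(host)}:{port}"


print(base_url("::1", 8188))
print(base_url("127.0.0.1", 8188))
```

Without the brackets, the colons inside an IPv6 address are ambiguous with the port separator.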

Security/correctness:
- cql engine + loader: replace naive '127.' prefix loopback guard with
  ipaddress.is_loopback (SSRF bypass); validate object_info shape; scan all
  entries for shape detection; don't classify every 2-item list as an edge;
  isolate shared subgraph defs at every hop (deep-nest aliasing); drop a
  tautological exact_paths guard
- cql errors: preserve canonical runtime_message in as_details()
- project: skip symlinks/paths escaping assets/; chunked sha256 hashing
- generate: guard emit FS write failures; reject list-valued image params
- models show: paginate so exact matches beyond the first page are found
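
The loopback-guard change is easy to see side by side. Both functions below are illustrative sketches; treating non-IP hostnames as "not loopback" is an assumption about the desired policy.

```python
import ipaddress


def naive_is_loopback(host):
    """The old style of check: a string prefix test."""
    return host.startswith("127.")


def is_loopback(host):
    """Parse the host as an IP literal and ask the ipaddress module."""
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False                       # hostnames are not trusted as loopback


for host in ("127.0.0.1", "127.evil.example", "::1"):
    print(host, naive_is_loopback(host), is_loopback(host))
```

The prefix test accepts a hostname such as `127.evil.example` and rejects the IPv6 loopback `::1`; the parsed check gets both right.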

Fragments:
- validate ports/node IDs at parse; don't coerce invalid _fragment sections
  to empty dicts; avoid <name>.json.json; enforce unique foreach item keys
- workflow_fragments: guard compose I/O with a registered compose_io_error

Misc:
- nodes: harden _resolved_where() against corrupt config
- run/preflight: wire target_label into the error; drop dead timeout param
- transfer: chain the exception with 'from'
- cloud/command: drop redundant ConfigManager re-import
- command package: export project

Skipped (with rationale): rename auth store.set (public API churn); add ':'
to run host blocklist (breaks IPv6/host:port); remove run verbose/local_paths
& run_cli pause_seconds (part of stable signatures / demo handler).

Amp-Thread-ID: https://ampcode.com/threads/T-019ebe75-cced-75d2-a7ac-4cce7e22d0f4
Co-authored-by: Amp <amp@ampcode.com>
enable() called init_tracking(True) twice — removed the redundant second
call. Both messages used pointless f-strings with literal-only placeholders;
simplified to plain strings. (disable() already correctly passed False, so
CodeRabbit's 'disable enables tracking' claim was stale.)

Amp-Thread-ID: https://ampcode.com/threads/T-019ebe75-cced-75d2-a7ac-4cce7e22d0f4
Co-authored-by: Amp <amp@ampcode.com>
The default project status fails on any total-coverage drop, which blocks
landing large feature surfaces (thin, lightly-tested CLI command wrappers).
Set an explicit 70% floor instead (current total ~77%) and disable the patch
status. Still a real gate; raise as coverage backfills.

Amp-Thread-ID: https://ampcode.com/threads/T-019ebe75-cced-75d2-a7ac-4cce7e22d0f4
Co-authored-by: Amp <amp@ampcode.com>
The CLI auto-selects JSON output when stdout isn't a TTY (rule 6 in the
renderer), and subprocess pipes are never TTYs — so 'manager uv-compile-default'
routed its human message to stderr, leaving stdout empty and failing
test_uv_compile_default_config on all platforms. These e2e tests assert on the
pretty/human surface, so force pretty mode in exec(); the JSON auto-selection
path stays covered by tests/comfy_cli/output unit tests.
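
The auto-selection rule and the test-side override can be sketched as follows. `pick_mode` is a hypothetical helper that models only this one rule plus an explicit override; the real renderer has additional rules that are not shown in this PR text.

```python
import io


def pick_mode(stream, force=None):
    """Choose 'pretty' for a terminal, 'json' for pipes, unless a mode is forced."""
    if force in ("json", "pretty"):
        return force
    return "pretty" if stream.isatty() else "json"


piped = io.StringIO()                      # behaves like a subprocess pipe: not a TTY
print(pick_mode(piped))
print(pick_mode(piped, force="pretty"))
```

Forcing the mode in the e2e helper keeps those tests on the human-facing output while the non-TTY default stays unchanged for agents.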

Amp-Thread-ID: https://ampcode.com/threads/T-019ebe75-cced-75d2-a7ac-4cce7e22d0f4
Co-authored-by: Amp <amp@ampcode.com>
os.fchmod is POSIX-only; on Windows accessing it raises AttributeError,
which the existing 'except OSError' didn't catch — crashing 'comfy run'
(test_run) on windows-latest with 'module os has no attribute fchmod'.
Guard with hasattr(os, 'fchmod'); Windows uses ACLs, not mode bits.
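
A minimal sketch of the guard, assuming the goal is a best-effort owner-only mode on a freshly written file. The `restrict_permissions` name and the `0o600` mode are assumptions for illustration.

```python
import os
import tempfile


def restrict_permissions(fd):
    """Best-effort chmod via the file descriptor; True if mode bits were applied."""
    if not hasattr(os, "fchmod"):
        return False                       # platform without os.fchmod: skip quietly
    try:
        os.fchmod(fd, 0o600)
    except OSError:
        return False
    return True


with tempfile.TemporaryFile() as handle:
    applied = restrict_permissions(handle.fileno())

print(applied)
```

Checking with `hasattr` matters because a missing attribute raises AttributeError, which an `except OSError` clause does not catch.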

Amp-Thread-ID: https://ampcode.com/threads/T-019ebe75-cced-75d2-a7ac-4cce7e22d0f4
Co-authored-by: Amp <amp@ampcode.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…s built

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…d creative-presentation skills

Agent-first improvements across discovery, errors, workflow composition, and skills.

Composition
- decompose: project any workflow into a reusable fragment (the inverse of
  compose). Decomposed fragments are self-documenting — they record `source`
  and a how-to-edit `description`, with named/typed params derived from node
  titles. Skill directives steer agents to edit source, not the compiled
  artifact, and to decompose an existing workflow rather than re-author it.

Truthful schemas (the recurring schema-vs-cloud trap)
- Preserve combo option types when parsing object_info, so `nodes show` reports
  the real ints (Sora-2 `duration` → [4,8,12], not ["4","8","12"]). Local
  validate stays lenient; the displayed schema now matches what the cloud
  accepts. Enum-rejection errors carry the full, typed `valid_options` — no
  truncation, so agents pick a real value instead of guessing.
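
The type-preservation point can be sketched as follows. This assumes an input spec whose first element is the list of combo options; the function names and the `invalid_enum_value` code are hypothetical, not the ones used in `cql/`.

```python
def combo_options(input_spec):
    """Return a combo input's options with their original types intact."""
    first = input_spec[0]
    # Copy the list as-is: no str() coercion, so ints stay ints.
    return list(first) if isinstance(first, list) else None


def lenient_match(value, options):
    """Local validation stays lenient: compare by string form."""
    return str(value) in {str(option) for option in options}


def enum_rejection(value, options):
    """Build an error payload carrying the full, typed option list."""
    return {
        "code": "invalid_enum_value",  # hypothetical code name
        "value": value,
        "valid_options": list(options),  # never truncated
    }


spec = [[4, 8, 12], {"default": 4}]
options = combo_options(spec)
print(options)                      # typed options as the schema reports them
print(lenient_match("8", options))  # a stringified value still passes locally
print(enum_rejection(5, options))
```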

Errors as navigation signals
- renderer.error inherits a code's REGISTERED hint when the call site omits one
  (None or blank), so every error points toward the fix — no dead ends. A
  guardrail test enforces a navigation hint on every registered code (one
  terminal allowlist). Filled the 5 dead-end codes.
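
The hint fallback could look roughly like this. The registry contents and the name `resolve_hint` are assumptions made for illustration; only the rule (a missing or blank hint falls back to the code's registered hint) is taken from the description above.

```python
from typing import Optional

# Hypothetical registry: error code -> navigation hint.
REGISTERED_HINTS = {
    "no_prompt_ids": "Pass at least one prompt_id, or use --all.",
    "wait_timeout": "Raise --timeout or check the job with `comfy jobs status`.",
}


def resolve_hint(code: str, hint: Optional[str] = None) -> Optional[str]:
    """Prefer the call site's hint; fall back to the registered one."""
    if hint is not None and hint.strip():
        return hint
    return REGISTERED_HINTS.get(code)


print(resolve_hint("no_prompt_ids"))             # registered hint
print(resolve_hint("no_prompt_ids", "   "))      # blank also falls back
print(resolve_hint("no_prompt_ids", "Custom."))  # call-site hint wins
```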

New command
- `comfy preview`: render a previewable PNG from any media — image -> thumbnail,
  video -> contact sheet, audio -> waveform — so results are shown, not
  described. Schema + error codes registered.

Fixes
- `generate --download` now saves the file in `--json` mode (was a forced curl).
- `generate list` / `generate schema` now honor `--json`.
- `validate` no longer warns on its own `_meta` provenance block.

Skills
- Rewrote comfy-relay into a creative-presentation + async/subagent playbook
  (show-don't-tell, fan-out, one subagent per shot). comfy gains presentation,
  compile-model, decompose-first, and one-graph cloud-handoff guidance.

All changes TDD'd; full suite green (2303 passed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

skishore23 and others added 4 commits June 16, 2026 12:46
`comfy update` and `comfy install` shelled out to `python -m pip install …`
against the resolved workspace interpreter. A uv-managed venv often ships without
pip, so the call failed with `No module named pip` and raised CalledProcessError.

Add `ensure_pip(python)` (in uv.py): an idempotent, one-time bootstrap that makes
`python -m pip` usable before a pip step. It is a no-op when pip is present, so
every existing pip-install line stays byte-identical and the installer's
behaviour is unchanged. When pip is missing it bootstraps via uv
(`uv pip install --python <python> pip`) or `ensurepip`. The pure dispatch
(`pip_bootstrap_cmd`) is unit-tested.
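
A sketch of what a pure dispatch function of this kind might look like. The signature (an optional `uv_path` argument) is an assumption; the actual `pip_bootstrap_cmd` in uv.py may differ. The uv command is the one quoted above, and `python -m ensurepip --upgrade` is the standard-library bootstrap.

```python
from typing import List, Optional


def pip_bootstrap_cmd(python: str, uv_path: Optional[str]) -> List[str]:
    """Return the argv that installs pip into `python`'s environment.

    Pure function: it only builds the command and never runs it, which is
    what makes it easy to unit-test. (Signature is an assumption.)
    """
    if uv_path:
        # uv available: install pip into the target interpreter's env.
        return [uv_path, "pip", "install", "--python", python, "pip"]
    # No uv: fall back to the standard-library bootstrapper.
    return [python, "-m", "ensurepip", "--upgrade"]


print(pip_bootstrap_cmd("/venv/bin/python", "/usr/bin/uv"))
print(pip_bootstrap_cmd("/venv/bin/python", None))
```

Keeping the decision separate from the `subprocess` call is the design choice that lets flow tests stub `ensure_pip` while the dispatch is tested on its own.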

Wired as a guard in front of the pip steps — never inside them — so the install
sites keep their `check=False` + custom return-code handling and their mocked
tests:
- update: before the requirements install
- install (normal path): before the deps install (covers torch + requirements
  + the manager that follows)
- install (--fast-deps path): before the separate manager install (uv env,
  pip-less — previously crashed)

Tests: pure dispatch covered in tests/uv; flow tests (python-resolution,
manager-gui) stub `ensure_pip` since they use a fake interpreter. Full suite
green (2306 passed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Multi-job wait: block until ALL given prompt_ids reach a terminal state, then
emit a summary envelope (completed/failed/cancelled/timed_out + per-job rows).
Replaces hand-rolled "poll each id until terminal" loops in agent pipelines.

- Emits a `settled` NDJSON event as each job finishes (stream mode).
- Exit codes mirror `jobs watch`: any error -> 1, cancelled -> 130, timeout ->
  1 (code `wait_timeout`); all completed -> 0.
- `--all` waits on every locally-tracked non-terminal job; `--timeout` bounds
  the wait; `--poll-interval` defaults to 5s (these are long jobs — don't
  hammer the server).
- Local path falls back to the on-disk state file when the server is down;
  cloud path polls the job status API.

6 unit tests: the core wait loop (settle + timeout) and the command's exit
codes (all-completed, any-error, no-ids).
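
The wait loop and exit-code mapping can be sketched as below. All names here are hypothetical, and one detail is an assumption the commit message does not settle: when a batch mixes failures and cancellations, this sketch lets error/timeout (exit 1) take precedence over cancelled (exit 130).

```python
import time

TERMINAL = {"completed", "failed", "cancelled"}


def wait_all(prompt_ids, get_state, timeout, poll_interval=5.0,
             clock=time.monotonic, sleep=time.sleep):
    """Block until every job is terminal or the timeout elapses."""
    deadline = clock() + timeout
    settled = {}
    pending = list(prompt_ids)
    while pending:
        for pid in list(pending):
            state = get_state(pid)
            if state in TERMINAL:
                settled[pid] = state  # a stream mode would emit `settled` here
                pending.remove(pid)
        if not pending:
            break
        if clock() >= deadline:
            for pid in pending:
                settled[pid] = "timed_out"
            break
        sleep(poll_interval)
    return settled


def exit_code(settled):
    """Map per-job terminal states to a process exit code."""
    states = set(settled.values())
    if "failed" in states or "timed_out" in states:
        return 1
    if "cancelled" in states:
        return 130
    return 0
```

Injecting `clock` and `sleep` keeps the loop testable without real waiting, which is one way to cover the settle and timeout paths in unit tests.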

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…alogue lip-sync

Lessons hardened from a multi-film production session.

comfy/SKILL.md
- Skill-family pointer up front (skim the siblings before a big task).
- Multi-stage orchestration: the deliberate per-shot fan-out exception for
  partner-API video/avatar films — submit N -> `comfy jobs wait` all -> download
  -> ffmpeg conform in shell — distinct from the one-graph/foreach rule.
- Video: KlingAvatar lip-syncs by construction, but it pads video past the audio,
  so trim each clip before concat or the voice drifts.
- Audio: cloud LoadAudio + VHS_LoadAudioUpload are upload-blind (generate audio
  in-graph); eleven_v3 + performance tags + low stability for emotion
  (model.style caps at 0.2); FB_Qwen3TTSVoiceDesign for accents (instruct+seed).
- Feedback: nudge for friction/papercut feedback after a high-friction session.

comfy-director/SKILL.md
- Dialogue must lip-sync by construction; cut between angles, never B-roll over VO.
- Lock voice/casting by EAR before the expensive avatar renders.
- Verify the mix by measurement (ffmpeg volumedetect), not faith.
- New continuity-toolkit and common-mistake rows.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
CI (the agent-CLI contract ratchets) failed on the new `comfy jobs wait`:
- test_every_raised_code_is_registered: `no_prompt_ids` was raised but not in
  the error-code registry.
- test_every_emitted_command_registers_a_schema: `jobs wait` emits an envelope
  but had no entry in discovery.COMMAND_SCHEMAS.

Fixes:
- Register `no_prompt_ids` and `wait_timeout` in error_codes.py (with hints).
- Emit the failure codes as string literals (execution_error / cancelled /
  wait_timeout) so the AST-scan ratchet sees them — the previous `code=<var>`
  form was invisible to the scan, which would have left `wait_timeout` an
  unregistered runtime code.
- Add schemas/jobs_wait.json (the batch summary shape — counts + per-job
  terminal states) and map `comfy jobs wait` -> `jobs_wait`; `jobs wait` emits a
  different shape than ls/status/watch (no host/port), so it gets its own schema.
- Test that the summary payload validates against the schema.
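
To illustrate why `code=<var>` was invisible to the ratchet, here is a minimal AST scan that collects only string-literal `code=` keyword arguments. It is a sketch of the general technique, not the project's actual test; the function name is hypothetical.

```python
import ast


def literal_error_codes(source: str) -> set:
    """Collect every string literal passed as a `code=` keyword argument."""
    codes = set()
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            value = kw.value
            # Only literals are visible statically; a variable is an
            # ast.Name node whose runtime value the scan cannot know.
            if (kw.arg == "code" and isinstance(value, ast.Constant)
                    and isinstance(value.value, str)):
                codes.add(value.value)
    return codes


sample = '''
renderer.error(code="wait_timeout", message="timed out")
failure = "execution_error"
renderer.error(code=failure, message="job failed")
'''
print(literal_error_codes(sample))
```

Only `wait_timeout` is found in the sample; `execution_error` reaches the call through a variable, so a registry check built on this scan would never see it.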

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Comment thread tests/e2e/verify_tracking_live.py
@dosubot dosubot Bot added the lgtm This PR has been approved by a maintainer label Jun 21, 2026
skishore23 and others added 4 commits June 21, 2026 16:28
It's an end-to-end verification, so it belongs alongside the other e2e
checks. Kept the non-`test_` prefix so pytest doesn't collect it — it's a
manual, opt-in, env-gated script (real PostHog round-trip, never in CI), run
directly via `python tests/e2e/verify_tracking_live.py`. Updated the usage
path in its docstring; nothing else referenced the old location.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A new user had no obvious path to the setup wizard: bare `comfy` led with
`comfy cloud login`, and running a command directly gave no pointer to the wizard.

- intro_banner now leads its quick-start with `comfy setup` and the
  not-signed-in line points at the wizard (one step: routing + sign-in + skills).
- First-run nudge: the first time an unconfigured (not-signed-in) user runs any
  real command, print a one-line "Run `comfy setup`" hint to stderr, once
  (config-flag gated). Hard-gated to interactive pretty output only — never in
  JSON mode, never on a non-TTY, never for signed-in/already-nudged installs,
  and swallow any error so it can't break a command.

Tested: banner leads with setup; nudge fires once then stays quiet; signed-in
users are never nudged.
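
The gating rules for the first-run nudge can be sketched as a single guarded function. The `state` dictionary, its keys, and the hint wording are assumptions for illustration; the real code reads a config flag and the resolved output mode.

```python
def maybe_nudge(state, print_hint):
    """Print the setup hint at most once, and never break the command."""
    try:
        if state["signed_in"] or state["nudged"]:
            return False
        if not state["is_tty"] or state["mode"] != "pretty":
            return False  # never in JSON mode, never on a non-TTY
        print_hint("Run `comfy setup` to get started.")
        state["nudged"] = True  # stands in for the persisted config flag
        return True
    except Exception:
        # Swallow everything: a nudge failure must not fail the command.
        return False
```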

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…lope

Piping bare `comfy` (non-TTY) auto-selected JSON and emitted a useless welcome
*envelope* — bad UX for a human redirecting/paging output, and no use to an
agent (agents read `discover`/`--help-json`, never bare `comfy`).

The welcome screen is now JSON only when machine output is actually requested:
explicit `--json`/`--json-stream`, `COMFY_OUTPUT=json|ndjson`, or a REAL detected
agent (caller kind agent/claude-code/explicit label) — not merely because stdout
isn't a TTY. `detect_caller` flags a bare pipe as agentic (kind="pipe"), so the
fix keys on kind != "pipe" to tell a real agent from a human redirect, and
renders the banner straight to stdout regardless of the resolved mode.

Auto-JSON-on-non-TTY for actual commands (run/jobs/which/...) is unchanged — an
agent shelling out still gets JSON. Only the bare welcome is overridden.
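
The welcome-screen decision described above reduces to a small predicate. This is a sketch under stated assumptions: the function name and parameters are hypothetical, and `caller_kind` is taken to be `None` for a human at a terminal, which the commit message does not specify.

```python
def welcome_as_json(json_flag, env_output, caller_kind):
    """Decide whether bare `comfy` should emit a JSON envelope."""
    if json_flag:  # explicit --json / --json-stream
        return True
    if env_output in ("json", "ndjson"):  # COMFY_OUTPUT=json|ndjson
        return True
    # A bare pipe is reported as kind="pipe"; treat it as a human redirect,
    # not a real agent. Any other detected kind counts as an agent.
    return caller_kind is not None and caller_kind != "pipe"


print(welcome_as_json(False, None, "pipe"))         # human piping output
print(welcome_as_json(False, None, "claude-code"))  # real agent
print(welcome_as_json(False, "ndjson", "pipe"))     # explicit env request
```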

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Follow-up to the move (52bdb18): update the in-docstring run command from
scripts/ to tests/e2e/.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@skishore23 skishore23 merged commit 64c9883 into main Jun 22, 2026
15 of 16 checks passed
@skishore23 skishore23 deleted the agent-cli branch June 22, 2026 01:07

Labels

lgtm This PR has been approved by a maintainer size:XXL This PR changes 1000+ lines, ignoring generated files.


3 participants